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Abstract 

Synchronization of rotations is the problem of estimating a set of rotations FU G SO(n),i = 
1 ... N based on noisy measurements of relative rotations RtRj. This fundamental problem 
has found many recent applications, most importantly in structural biology. We provide a 
framework to study synchronization as estimation on Riemannian manifolds for arbitrary n 
under a large family of noise models. The noise models we address essentially encompass 
zero-mean isotropic noise, and we develop tools for Gaussian-like as well as heavy-tail types of 
noise in particular. As a main contribution, we establish formulas for the Fisher information 
matrix and associated Cramer-Rao bounds of synchronization. We find that these bounds are 
structured by the pseudoinverse of the measurement graph Laplacian, where edge weights are 
proportional to measurement quality. We leverage this to provide interpretation and visual- 
ization tools for these bounds in both the anchored and anchor-free scenarios. Similar bounds 
previously established were limited to rotations in the plane and Gaussian-like noise. 

Keywords: Synchronization of rotations, estimation on manifolds, estimation on graphs, 
graph Laplacian, Fisher information, Cramer-Rao bounds, distributions on the rotation group, 
Langevin. 
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Introduction 

Synchronization of rotations is the problem of estimating rotation matrices . . . , Rn based on 
noisy measurements of relative rotations R^Rj. The set of available measurements gives rise to 
a graph structure, where the N nodes correspond to the rotations Ri and an edge is present 
between two nodes i and j if a measurement of RiRj is given. Depending on the application, 
some of the rotations may be known in advance or not. The known rotations, if any, are called 
anchors. In the absence of anchors, it is only possible to recover the rotations up to a global 
rotation, since the measurements only reveal information about relative rotations. Motivated by 
the pervasiveness of synchronization of rotations in applications, we propose a derivation and 
analysis of Cramer-Rao bounds for this estimation problem. Our results hold for rotations in 
SO(n) = {R G R" xn : R T R = I n and deti? = 1} for arbitrary n and for a large family of 
practically useful noise models. 

Synchronization of rotations appears naturally in a number of important applications. Tron 
and Vidal for example consider a network of cameras (37^ . Each camera has a certain position 
in K 3 and orientation in SO (3). For some pairs of cameras, a calibration procedure produces a 



noisy measurement of relative position and relative orientation. The task of using all relative ori- 
entation measurements simultaneously to estimate the configuration of the individual cameras is a 
synchronization problem. Cucuringu et al. address sensor network localization based on inter-node 
distance measurements (l3j . In their approach, they decompose the network in small, overlapping, 
rigid patches. Each patch is easily embedded in space thanks to its rigidity, but the individual 
embeddings are noisy. These embeddings are then aggregated by aligning overlapping patches. 
For each pair of such patches, a measurement of relative orientation is produced. Synchronization 
permits the use all of these measurements simultaneously to prevent error propagation. In related 



work, a similar approach is applied to the molecule problem Tzeneva et al. applied synchro- 
nization to the construction of 3D models of objects based on scans of the objects under various 
unknown orientations [38^ . Singer and Shkolnisky study cryo-EM imaging [351 ] . In this problem, 
the aim is to produce a 3D model of a macro-molecule based on many projections (pictures) of the 
molecule under various random and unknown orientations. A procedure specific to the cryo-EM 
imaging technique helps estimating the relative orientation between pairs of projections, but this 
process is very noisy. In fact, most measurements are outliers. The task is to use these noisy 
measurements of relative orientations of images to recover the true orientations under which the 
images were acquired. This naturally falls into the scope of synchronization of rotations, and calls 
for very robust algorithms. 

1.1 Previous work 

Tron and Vidal, in their work about camera calibration, develop distributed algorithms based on 
consensus on manifolds to solve synchronization on R 3 x SO(3) |37(. Singer studies synchronization 
of phases, that is, rotations in the plane, and reflects upon the generic nature of synchronization as 
the task of estimating group elements g±, . . . , giq based on measurements of their ratios [34| . 
In that work, the author focuses on synchronization in the presence of many outliers in the mea- 
surements and establishes that synchronization is surprisingly robust against such outliers. Bounds 
are derived based on information-theoretic arguments to establish how many measurements need to 
be accurate for synchronization to be possible. A fast algorithm based on eigenvector computations 
is shown to obtain good solutions. In particular, it performs better than random estimation as 
soon as the information-theoretic threshold is reached. In further work, Bandeira et al. go deeper 
in the analysis and generalize the bounds and the algorithms to rotations in 1™ Q . Hartley et al. 
develop a robust algorithm based on a Weiszfeld iteration to compute Ll-means on SO(n), and 
extend the algorithm to perform robust synchronization of rotations [l8|. Russel et al. develop 
a decentralized algorithm for synchronization on the group of translations R n [3l[. Barooah and 
Hespanha study the covariance of the BLUE estimator for synchronization on M. n with anchors. 
This covariance coincides with the Cramer-Rao bound (CRB) under Gaussian noise. They give 
interpretations of the covariance in terms of the resistance distance on the measurement graph [H[ . 
Howard et al. study synchronization on the group of translations IR™ and on the group of phases 
SO(2) [ill ]. They establish CRB's for synchronization in the presence of Gaussian-like noise on 
these groups and provide decentralized algorithms to solve synchronization. Their derivation of 
the CRB's seems to rely heavily on the commutativity (and thus flatness) of K™ and SO(2), and 
hence does not apply to synchronization on SO(n) in general. Furthermore, they only analyze 
Gaussian-like noise. 

CRB's are a classical tool in estimation theory [2!| that provide a lower-bound on the variance 
of any unbiased estimator for an estimation problem, based on the measurements' distribution. 
The classical results focus on estimation problems for which the sought parameter belongs to a 
Euclidean space. In the context of synchronization, this is not sufficient, since the parameters 
we seek belong to the manifold of rotations SO(n). Important work by Smith [36( ] as well as by 



Xavicr and Barroso [41[ extends the theory of CRB's to the realm of manifolds in a practical way. 
Recall that, in the absence of anchors, synchronization can only be solved up to a global rotation. 
This ambiguity, as we shall see, gives rise to a singular Fisher information matrix (FIM). As the 
CRB's are usually expressed in terms of the inverse of the FIM, this is problematic. Xavier and 
Barroso provide a nice geometric interpretation in terms of estimation on quotient manifolds of 
the well-known fact that one may use the pseudoinverse of the FIM in such situations [4(j. It is 
then apparent that to establish CRB's for synchronization, one needs to either impose anchors, 
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leading to a submanifold geometry, or work on the quotient manifold. In [7|, an attempt is made 
at providing a unified framework to build such CRB's. We use these tools in the present work. 

Other authors have established CRB's for sensor network localization (SNL) and synchroniza- 
tion problems. Ash and Moses study SNL based on inter-agent distance measurements, and notably 
give an interpretation of the CRB in the absence of anchors 0] . Chang and Sahai tackle the same 
problem |10l|. As we mentioned earlier, Howard et al. derive CRB's for synchronization on R™ and 
SO(2) |2lj. CRB's for synchronization on R™ are re-derived as an example in Q. A remarkable 
fact is that, for all these problems of estimation on graphs, the pseudoinverse of the graph Lapla- 
cian plays a fundamental role in the CRB — although not all authors explicitly reflect on this. As 
we shall see, this special structure is rich in interpretations, many of which exceed the context of 
synchronization of rotations specifically. 

1.2 Contributions and outline 

In this work, we state the problem of synchronization of rotations in SO(n) for arbitrary n as 
an estimation problem on a manifold — Section [3] We describe the geometry of this manifold 
both for anchored synchronization (giving rise to a submanifold geometry) and for anchor-free 
synchronization (giving rise to a quotient geometry) — Section [21 Among other things, this paves 
the way for maximum- likelihood estimation using optimization on manifolds [l[ , which will be the 
focus of further work. 

We describe a family of noise models (probability density functions) on SO(n) that fulfill a few 
assumptions — Section UJ We show that this family is both useful for applications (it essentially 
contains zero-mean, isotropic noise models) and practical to work with (the expectations one is 
lead to compute via integrals on SO(n) are easily converted into classical integrals on R n ). In 
particular, this family includes a kind of heavy-tailed distribution on SO(n) that appears to be 
new. We describe this distribution and we are convinced it will prove useful for other estimation 
problems on SO(n) with outliers. 

In Section^ we derive the Fisher information metric (FIM) for synchronization and establish 
it is structured by the Laplacian of the measurement graph, where edge weights are proportional 
to the quality of their respective measurements. The FIM plays a central role in the Cramer- 
Rao bounds (CRB) we establish for anchored and anchor-free synchronization — Section \E[ The 
CRB's are structured by the pseudoinverse of the Laplacian of the measurement graph. We derive 
clear interpretations of these bounds in both cases (with visualization tools), and note they differ 
significantly. 

As a main result for anchored synchronization, we show that for any unbiased estimator Ri of 
the rotation Ri, asymptotically for small errors, 



where dist is the geodesic distance on SO(n), d = n(n — l)/2, La is the Laplacian of the weighted 
measurement graph with rows and columns corresponding to anchors set to zero and f denotes the 
Moore-Penrose pseudoinverse — see (|6.10[) . The better a measurement is, the larger the weight on 
the associated edge is — see (|5.20|) . This bound holds in a small-error regime under the assumption 
that noise on different measurements is independent, that the measurements are isotropically 
distributed around the true relative rotations and that there is at least one anchor in each connected 
component of the graph. 

As a main result for anchor- free synchronization, we show that for any unbiased estimator R { Rj 
of the relative rotation asymptotically for small errors, 



where L is the Laplacian of the weighted measurement graph and e, is the i column of the N x N 
identity matrix sec (|6.24D . This bound holds in a small-error regime under the assumption that 
noise on different measurements is independent, that the measurements are isotropically distributed 
around the true relative rotations and that the measurement graph is connected. 




(1.1) 




(1.2) 



3 



Section [7] hosts a few comments on the CRB's. In particular, a link with the Fiedler value 
of the graph is described and the robustness of synchronization versus outliers is confirmed, via 
arguments that differ from those in [34 1 . 



2 Synchronization of rotations 

Synchronization of rotations is the problem of estimating a set of rotations R±, . . . , Rn £ SO(n) 
from noisy measurements of some of the relative rotations RiRj- In this section, we model syn- 
chronization as an estimation problem on a manifold, since the set of rotations SO(n) is a Lie 
group, i.e., a group and a manifold at the same time. 

In our estimation problem, the target quantities (the parameters) are the rotation matrices 

. . . , Rn G SO(n). The natural parameter space is thus: 

V = SO(n) x • • • x SO(n) (N copies). (2.1) 

Let [N] = {1, .. . ,N}. Consider a set £ C [N] x [N] such that (i, j) e £ =S> i ^ j and e £. 
This set defines an undirected graph over N nodes, 

G = ([N],£) (the measurement graph). (2-2) 

For each edge G £ , i < j,we have a measurement € SO(n) of the form 

H ij = Z ij R i R ]i ( 2 - 3 ) 

where Z^ is a random variable distributed over SO(n) following a probability density function (pdf ) 
fij : SO(n) — > R + , with respect to the Haar measure fi on SO(n) — see Sectional For example, 
when is detcrministically equal to the identity matrix /, the measurement is perfect (noise- 
free), whereas when Zij is uniformly distributed over (the compact set) SO(n), the measurement 
contains no information. We say that the measurement is unbiased, or that the noise has zero- 
mean, if the mean value of Z^ is the identity matrix — a notion we make precise in Section H We 
also say that noise is isotropic if its probability density function is only a function of distance to 
the identity. Different notions of distance on SO(n) yield different notions of isotropy. In Section^] 
we give a few examples of useful zero- mean, isotropic distributions on SO(n). 

Pairs (i,j) and (j, i) in £ refer to the same measurement. By symmetry, for i < j, we define 
Hj i = Z^RjRj = H~[j and the random variable Zji and its density fji are defined accordingly in 
terms of /y and Z^. In particular, 

Zji = RjRjzJjRtR}, and fij(Zij) = fji(Zji). (2.4) 

The pdf's fij and fji are linked as such because the Haar measure fi is invariant under the change 
of variable relating Zij and Zji — see Sectional 

In this work, we restrict our attention to noise models that fulfill the three following assump- 
tions: 

Assumption 1 (smoothness and support). Each pdf fij : SO(n) — > R+ = (0,+oo) is a smooth, 
positive function. 

Assumption 2 (independence). The Zij 's associated to different edges of the measurement graph 
are independent random variables. That is, if (i,j) ^ {p,q) and (i,j) ^ (q,p), Zij and Z pq are 
independent. 

Assumption 3 (invariance) . Each pdf fij is invariant under orthogonal conjugation, that is, 
VZ € SO(n), VQ G 0(n), fij(QZQ T ) = fij(Z). We say fij is a spectral function, since it only 
depends on the eigenvalues of its argument. The eigenvalues of matrices in SO(2fc) have the form 
e ±iei , . . . ,e illSk , with < 9\,...,9k < 7T. The eigenvalues of matrices in SO(2fc + 1) have an 
additional eigenvalue 1. 
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Assumption [T] is satisfied for all the noise models we consider; it could be relaxed to some 
extent but would make some of the proofs more technical. Assumption [5] is necessary to make 
the joint pdf of the whole estimation problem easy to derive, leading to an easy expression for the 
log-likelihood function. As we will see in Section [5l it is also at the heart of the nice Laplacian 
structure of the Fisher information matrix. Assumption [3] is a technical condition that will prove 
useful in many respects. One of them is the observation that pdf 's that obey Assumption [3] are 
easy to integrate over SO(n). We expand on this in Section @J where we also show that a large 
family of interesting pdf 's satisfy these assumptions, namely, zero-mean isotropic distributions. 

Under Assumption [21 the log-likelihood of an estimator R = . . . , Rjy) £ V. given the 
measurements ffy , is given by: 

i 1 N 

L(R) = - logfe(W^)=2EE lo S/y(W^ ( 2 - 5 ) 

(t,j)e£ »=i jeVi 

where Vi C [N] is the set of neighbors of node i, i.e., j 6 V< (*,.?) € £. The coefficient 1/2 reflects 
the fact that measurements and give the same information and are deterministically 

linked. Under Assumption [TJ L is a smooth function on the smooth manifold V '. 
The log-likelihood function is invariant under a global rotation. Indeed, 

VReP, VQe SO(n), L(RQ)=L(R), (2.6) 

where RQ denotes (RiQ, . . . , RnQ) £ V . This invariancc encodes the fact that all sets of rotations 
of the form RQ for Q £ SO(n) yield the same distribution of the measurements Hij, and are hence 
indistinguishable, i.e., rotations can only be recovered up to a global rotation. 

To resolve the ambiguity, one can follow at least two courses of action. One is to include 
additional constraints, most naturally in the form of anchors, i.e., assume some of the rotations are 
knowr0. The other is to acknowledge the invariance by working on the associated quotient space. 

Following the first path, the parameter space becomes a Riemannian submanifold of V . Follow- 
ing the second path, the parameter space becomes a Riemannian quotient manifold of V . In the 
next section, we describe the geometry of both. In Section HI we review useful tools to compute in- 
tegrals on SO(n) and describe probability density functions that are both useful to model practical 
problems and fulfill our assumptions. In Scction[3]wc use the expression of the log-likelihood func- 
tion (|2.5[) to derive the Fisher information matrix for synchronization, which plays a central role 
in deriving the Cramer- Rao bounds for synchronization with and without anchors — see Section [5] 

Remark 1 (A word about other noise models). We show that measurements of the form = 
Z i ^ 1 R i R^Z i ^ 2 j with Zij^i and Zij^ two independent random rotations with pdf's satisfying As- 
sumptions 1-3, satisfy the noise model considered in the present work. In doing so, we use some 
material from Section [7J For notational convenience, let us consider H = Z1RZ2, with Z\, Zi two 
independent random rotations with pdf's f±, $2 satisfying Assumption 1-3, R £ SO(n) fixed. Then, 
the pdf of H is the function h : SO(n) —> M + given by (essentially) the convolution of fi and f2 on 
SO(n): 

h(H)= [ f 1 (Z)f 2 (R T Z T H)dfi(Z)= [ f x {Z)h{Z T HR' v )^{Z), (2.7) 

JSO(n) JSO(n) 

where we used that fi is spectral: f2(R T Z T H) = f2{RR T Z T HR T ) . Let Z eq be a random rota- 
tion with smooth pdf f cq . We will shape f eq such that the random rotation Z eq R has the same 
distribution as H . This condition can be written as follows: for all measurable subsets S C SO(n), 

[ h(Z)dii(Z)= [ f eq (Z)dfi(Z)= f f cq (ZR T )dn(Z), (2.8) 

JS JSR T JS 

1 If we only know that Ri is close to some matrix R, and not necessarily equal to it, we may add a phony node 
Rn+i anchored at R, and link that node and Ri with a high confidence measure ii^jv+i = In- This makes it 
possible to have "soft anchors" . 
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where, going from the second to the third integral, we used the change of variable Z := ZR T and 
the bi-invariance of the Haar measure fi. In words: for all S, the probability that H belongs to 
S must be the same as the probability that Z cq R belongs to S . This must hold for all S , hence 
f eq (Z cq R T ) = h(Z cq ), or equivalently: 

f cq (Z cq ) = h(Z cq R)= [ f 1 (Z)f 2 (Z T Z cq )d^(Z). (2.9) 

JSO(n) 

This uniquely defines the pdf of Z cq . It remains to show that f cq is a spectral function. For all 
Q e 0(n), 

f cq (QZ cq Q T )= f h{Z)f 2 {Z T QZ eq Q T )dn{Z) (2.10) 

JSO(n) 



(h is spectral) =[ f 1 (Z)f 2 (Q T Z T QZ cq )d t i(Z) (2.11) 

JSO(n) 

(change of variable: Z :=QZQ T ) = f 1 (QZQ T )f 2 (Z T Z cq ) dfi(Z) (2.12) 

JSO(n) 



(h is spectral) = f 1 {Z)f 2 (Z'Z cq )d^Z)=f eq (Z cq ). (2.13) 

JSO(n) 

Hence, the noise model H JA = Z-- , R, ; RjZ;, , can be replaced with the model H 4 , = Z 44 „„i£.i?J and 
the pdf of Zij yeq is such that it falls within the scope of the present work. 

In particular, if f\ is a point mass at the identity, so that H = RZ 2 ( noise multiplying the 
relative rotation on the right rather than on the left), f cq = f 2 , so that it does not matter whether 
we consider = Z S -R S R\ or H,, = RjRjZ,-: they have the same distribution. 

3 Geometry of the parameter spaces 

Wc start with a quick reminder of the geometry of SO(n). It is assumed that the reader is familiar 
with standard concepts from Riemannian geometry [l], @, [28|. We then go on to describe the 
parameter spaces for the anchored and the anchor-free cases of synchronization. 

The group of rotations SO(?i) = {Q £ R™ xn : Q T Q = I and det Q = 1} is a connected, compact 
Lie group. Its Lie algebra is the set of skew-symmetric matrices: 

so(n) = {O e « nxn : n + n T = 0}. (3.1) 

This is also the tangent space at the identity TfSO(n) = So(n). The tangent space at Q e SO(n) 
is linked to so(n) via: 

TgSO(n) = Qso(n) = {QQ : Q e so(n)}. (3.2) 

As usual, we consider SO(n) as a Riemannian submanifold of the general linear group, and hence 
endow it with the usual metric: 

(Qfii,Qfi 2 > g = (fii,fi2>=taace(fiJfi a ). IMq = (QV, Q^)q = II^IIf • ( 3 - 3 ) 

For greater readability, we often omit the subscripts Q. The orthogonal projector from the em- 
bedding space M. nxn onto the tangent space TgSO(n) is: 

Pq(H) = Q sk(Q T H) , with sk(A) = (A - A T )/2. (3.4) 

The exponential map and the logarithmic map accept simple expressions in terms of matrix expo- 
nential and logarithm: 

Exp Q (QO) = Qcxp(r!), Log Ql (Q 2 ) = Qi log(QjQ 2 ). (3.5) 

The mapping 1 1— > Expg(tQri) defines a geodesic curve on SO(n), passing through Q with velocity 
QQ at time t = 0. Geodesic curves have zero acceleration and may be considered as the equivalent 
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of straight lines on manifolds. The logarithmic map Logg is (locally) the inverse of the exponential 
map Expg. The Riemannian (or geodesic) distance on SO(n) is the length of the shortest path 
(the geodesic arc) joining two points: 

dist(Qi,Q a ) = ||Log Ql (Q 2 )|| Qi = ||log(Q?Q a )|| p . (3.6) 

In particular, for rotations in the plane (n = 2) and in space (n = 3), the Riemannian distance 
between Qi and Q2 is >/29, where 8 e [0, n] is the angle by which QjQ 2 rotates. 

Let / : M" x ™ — > R be a differentiable function, and let / = /|so(n) be its restriction to SO(n). 
The gradient of / is a tangent vector field to SO(n) uniquely defined by: 

(grad/(Q), Qfi) = D/(Q)[Qil] Vil e 00 (n), (3.7) 

with grad/(Q) E TgSO(n) and T)f(Q)[QVl] the directional derivative of / at Q along Qil. Let 
V/(Q) be the usual gradient of / in R" x ™. Then, the gradient of / is easily computed as [l], 
cq.(3.37)]: 

grad/(Q)=P Q (V/(Q)). (3.8) 

In the sequel, we often write V/ to denote the gradient of / seen as a function in M. nxn , even if it 
is defined on SO(n). 

The parent parameter space for synchronization is the product Lie group V = SO(n) W . Its 
geometry is trivially obtained by element-wise extension of the geometry of SO(n) just described. 
In particular, tangent spaces and the Riemannian metric are given by: 

T R V = {Ril = (i?^, . . . , R N n N ) : ill, .... fijv € so(n)}, (3.9) 

N 

(Rfl,Rfl') R = ^trace^^:)- (3-10) 

i=l 

The gradient of the log- likelihood function L (|2.5[) . gradL(R), is a tangent vector in Tj^"P. The 
i th component of this gradient, that is, the gradient of the mapping Ri 1— > L(R) with Rj=a fixed, 
is a vector field on SO(n) which can be written as: 

grad, L(R) = ^ grad (x ^ log /, ; ; // ; .V ■) (Ri) (3.11) 

= E , m \ AT, g rad { x » MH ii R 3 x T j) m (3-12) 
E [gradlog/y(^. J R jJ RT)] T ^4. (3.13) 

3.1 Anchored case 

In specific applications, we may know some of the rotation matrices Ri. Let A c [N] be the set of 
indices of known rotations, called anchors. The associated parameter space 

V A = {R = (Ri, ...,R N )eV:Vi£A,Ri = R t } (3.14) 

is a Riemannian submanifold of V . The tangent space at R £ V a is given by: 

T A V A = {H e T A V : Vi € A, fli = 0}, (3.15) 

such that the orthogonal projector : T-^P —> T^/Pa simply sets to zero all components of a 
tangent vector that correspond to anchored rotations. All tools on V a (exponential and logarithmic 
map for example) are inherited in the obvious fashion from V . In particular, the Riemannian 
distance on Va is: 



i£A 



dist 2 (R,R) = J2 \\ l og(RjR'i) ■ (3.16) 
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Anchor-free case 



When no anchors are provided, the rotations can only be recovered up to a common rotation, since 
the measurements only provide information regarding relative rotations. More precisely, the distri- 
bution of the measurements Hij (|2.3|) is the same whether the true rotations are (Ri, ■ ■ ■ , Rn) or 
(RiQ, . . . , RnQ), regardless of Q G SO(n). Consequently, the measurements contain no informa- 
tion as to which of those sets of rotations is the right one. This leads us to define the equivalence 
relation 

(R U ...,R N ) ~ {R[,...,R' N ) & 3Q G SO(ra) : R l = R[Q for i = 1,...,N. (3.17) 

This equivalence relation partitions V into equivalence classes, often called fibers. The quotient 
space (the set of equivalence classes) 

V $ ±V/~, (3-18) 

is again a smooth manifold (in fact, V% is a coset manifold because it results from the quotient of 
the Lie group V by a closed subgroup oiV [28|, Prop. 11.12]). The notation reminds us that the set 
of anchors A is empty. Naturally, the log-likelihood function L is constant over equivalence classes 
and hence descends as a well-defined function on V%. Each fiber 

[R] = {{RiQ, . . . , R N Q) ■ Q G SO(n)} G V % (3.19) 

is a Ricmannian submanifold of the total space V . As such, at each point R, the fiber [R] accepts a 
tangent space that is a subspace of Tr/P. That tangent space to the fiber is called the vertical space 
at R, noted Vr. Vertical vectors point along directions that are parallel to the fibers. Vectors 
orthogonal, in the sense of the Ricmannian metric (|3. 10[) , to all vertical vectors form the horizontal 
space Hr = (Vr) , such that the tangent space TrP is equal to the direct sum Vr © Hr. 
Horizontal vectors are orthogonal to the fibers, hence point toward the other fibers, i.e., the other 
points on the quotient space V%. 

Because V% is a coset manifold, the projection 

7T : V -> V% : R i-> tt(R) = [R] (3.20) 

is a submersion. That is, the restricted differential D7t|h r is a full-rank linear map between Hr 
and TjRj'Pg. Practically, this means that the horizontal space Hr is naturally identified to the 
(abstract) tangent space TjrjTV This results in a practical means of representing abstract vectors 
of T[R]'p0 simply as vectors of Hr C TrP, where R is any arbitrarily chosen member of [R]. Each 
horizontal vector £r is unambiguously related to its abstract counterpart £[rj in T^jPg via 

£ [R] =Dtt(R)[£r]. (3.21) 

The representation £r of £[rj is called the horizontal lift of £[rj at R, a notion made precise in 0, 
§ 3.5.8] and depicted for intuition in [4(J Fig. 1-2] — these figures are also reproduced in 0, Fig. 2-3]. 

Consider ^rj and Wr], two tangent vectors at [R]. Let £r and r/R be their horizontal lifts at 
R G [R] and let £r' and 7?r/ be their horizontal lifts at R' G [R]. The Ricmannian metric on 
V (|3.10p is such that (£r,?7r) r = (£r',?7r')r'- This motivates us to define the metric 

(f[R],T7[R]) [R] = (£r,Wr ( 3 - 22 ) 

on V®, which is then well defined (it does not depend on our choice of R in [R]) and turns the 
restricted differential D7r(R) : Hr — > TjrjT^ into an isomctry. This is a Ricmannian metric and it 
is the only such metric such that 7r p.20[) is a Riemannian submersion from V to V$ [l7l Prop. 2.28]. 
Hence, V® is a Riemannian quotient manifold of V . 

We now describe the vertical and horizontal spaces of V w.r.t. the equivalence relation (|3.17l) . 
Let R G V and Q : K -> SO(n) : t i-> Q(t) such that Q is smooth and Q(0) = I. Then, the 
derivative Q'(0) = fi is some skew-symmetric matrix in so(n). Since (RiQ(t), . . . , Rj^Q(t)) G [R] 
for all t, it follows that f t (RiQ(t), . . . , R N Q(t))\ t=0 = (Riil, ■ ■ -,Rn^) is a tangent vector to the 
fiber [R] at R, i.e., it is a vertical vector at R. All vertical vectors have such form, hence: 



V R = {(R&, . . . , R N Ct) : n G so(n)}. 



(3.23) 



A horizontal vector (iZifii, . . . , R^^In) € Hr is orthogonal to all vertical vectors, i.e., Vf2 € so(n), 

AT / N \ 

= ((iiifii, . . . , ifofijv), (iZifi, • • • , fljvfi)) = J! trace((J2 i n i ) T R i n) = / Hi, « ) • (3-24) 

i=l \t=l / 

Since this is true for all skew-symmetric matrices f2, we find that the horizontal space is defined 
as: 

jv 

Hr = {(Rifl!, . . . , R N tt N ) :Oi,...,fijv eso(n) and =0}. (3.25) 

i=l 

This is not surprising: vertical vectors move all rotations in the same direction, remaining in the 
same equivalence class, whereas horizontal vectors move away toward other equivalence classes. 

We now define the logarithmic map on T$. Considering two points [R], [R] <G Vq, the logarithm 
Logr R i([R]) is the smallest tangent vector in TpjPg that brings us from the first equivalence class 
to the other through the exponential map. Working with the horizontal lift representation 

D^(R)| H ^[Log [R] ([R])] = (R 1 n 1 ,...,R N Q N ) G H R , (3.26) 

the fVs are skew-symmetric matrices solution of: 

min ||O x ||2 + ... + \\n N f F , (3.27) 

such that R % cxp(ft 4 ) = 7? t Q, i = l...N, and (3.28) 
«! + ■■• + n N = 0. (3.29) 

The rotation Q sweeps through all members of the equivalence class [R] in search of the one closest 
to R. By substituting = log^Rj^Q) in the objective function, we find that the objective value 
as a function of Q is J2i=i n P s(-^i~A < 2)llF- Critical points of this function w.r.t. Q verify 
£i=i ^» = 0: hence we need not enforce the last constraint: all candidate solutions are horizontal 
vectors (which is not surprising). Summing up, we find that the squared Ricmannian distance on 
7> obeys: 

N 2 

dist 2 ([R], [R]) = min V \\\og(RjR,Q) . (3.30) 

QGSO(ti)^-' II f 

Since SO(n) is compact, this is a well-defined quantity. Let Q G SO(ra) be one of the global 
minimizcrs. Then, an acceptable value for the logarithmic map is 

D7r(R)| H ^[Log [R] ([R])] = (i2ilog(i^R 1 Q) > ... > ^log(iiJfl J>r Q)) . (3.31) 

Under reasonable proximity conditions on [R] and [R], the global maximizer Q is uniquely defined, 
and hence so is the logarithmic map. Appendix [A] details uniqueness conditions and algorithms 
to compute Q, in connection with the literature about averaging on SO(n). For the sake of 
completeness, the same appendix describes embedded distances too. 



4 Measures, integrals and distributions on SO(n) 

The special orthogonal group SO(n), being a compact Lie group, admits a unique bi-invariant Haar 
measure jj, such that //(SO(n)) = 1 [6j, Thm3.6, p. 247]. Such a measure verifies, for all measurable 
subsets S C SO(n) and for all L,R G SO(n), that fi(LSR) = fi{S), where LSR = {LQR : Q G 
S} C SO(ra), that is, the measure of a portion of SO(n) is invariant under left and right actions of 
SO(n). We will need something slightly more general: 

Lemma 4.1 (extended bi-invariance). WL,R G O(n) such that det(LR) = 1, WS C SO(n) mea- 
surable, jjL^LSR) = n(S) holds. 
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Proof. LSR is still a measurable subset of SO(rc). Let // denote the Haar measure on 0(n) D 
SO(n). The restriction of // to the measurables of SO(n) is still a Haar measure. By the uniqueness 
of the Haar measure up to multiplicative constant, there exists a > such that for all measurable 
subsets T C SO(ra), we have n(T) = a(i'(T). Then, fi(LSR) = a^'{LSR) = a(j,'(S) = fj,(S). □ 

From the general theory of Lebesgue integration, we get a notion of integrals over SO(n) 
associated to the measure (i. Lemma 14.11 then translates into the following property, with / : 
SO(n) -^lan integrable function: 

WL,Re 0(n) s.t. det(L-R) = 1, / f{LZR)&n{Z)= [ f{Z)d^{Z). (4.1) 

JSO(n) JS0(n) 

This property will play an important role in the sequel. 

Functions / that are invariant under conjugation, meaning that for all Z, Q G SO(n) we have 
f(Z) = f{QZQ~ 1 ), are class functions. When the integrand in J so ^ f (Z) d^j,(Z) is a class 
function, we are in a position to use the Weyl integration formula specialized to SO(ra) [jj Exer- 
cise 18.1-2]. All spectral functions (Assumption [3]) are class functions (the converse is also true for 
SO(2fc + 1) but not for SO(2fc)). Weyl's formula comes in two flavors depending on the parity of n, 
and essentially reduces integrals on SO(n) to integrals over tori of dimension \ n/2\ . In particular, 
for n = 2 or 3, Weyl's formula for class functions / reads: 




(4.2) 



/ f(Z)dn(Z) = — / sin 9 cosO 1 (1 - cos0)d0, 
^SO(3) 2tt I n 1 / 

For n = 4, Weyl's formula is a double integral: 

tl7 \ A 1 r t ( A - //cos 0i -sin0A /cos0 2 -sin0 2 

!0(4 /^ d ^ = wU/riU 6l COS0! J'Un^ 2 cos0 2 

x | e <0i - e ^|2 . | e ifl! _ e —f 2 |2 d6 i ld6 ) 2 . (4.3) 

Once SO(n) is equipped with a measure fj, and accompanying integral notion, we can define 
distributions of random variables on SO(n) via probability density functions (pdf). In general, a 
pdf on SO(n) is a nonncgative measurable function / on SO(n) such that /g ( n ) f(Z) d[i(Z) = 1. 
In this work, for convenience, we further assume pdf 's are smooth and positive (Assumption [TJ , as 
we will need to compute the derivatives of their logarithm. 

Example 1 (uniform). The pdf associated to the uniform distribution is f(Z) = 1, since we 
normalized the Haar measure such that /i(SO(?i)) = 1. We write Z ~ Uni(SO(?i)) to mean that 
Z is a uniformly distributed random rotation. A number of algorithms exist to generate pseudo- 
random rotation matrices from the uniform distribution fill . §2.5.1] fldi l. Possibly one of the 
easiest methods to implement is the following 0(n 3 ) algorithm, adapted from flb\ . Method A, p. 22] 
with implementation details as in ]2a] (for large n, see the former paper for algorithms with better 
complexity): 

1. Generate A € W nxn , such that the entries Aij ~ A/"(0, 1) are i.i.d. normal random variables; 

2. Obtain a QR decomposition of A: QR = A; 

3. Set Q := Qdiag(sign(diag(i?))) (this ensures the mapping At-^Qis well-defined; see f2$l] ): 
4- Q is now uniform on 0(n). If dct(Q) = — 1, permute columns 1 and 2 of Q. Return Q. 

Example 2 (isotropic Langevin). The isotropic Langevin distribution on SO(n) with mean Q £ 
SO(n) and concentration k > has pdf 

f(Z) = — ^-cxp(Ktracc(Q T Z)), (4.4) 
c„(k) 
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where c„(k) is a normalization constant such that f has unit mass. We write Z ~ Lang(Q, k) to 
mean that Z is a random variable with pdf (|4.4p . For n = 0, Z ~ Uni(SO(n)); m t/ie Zirmt k — > oo, 
Z = Q w.p. 1. The isotropic Langevin distribution has a Gaussian-like shape. The Langevin pdf 
centered around Q = I is a spectral function, i.e., it fulfills Assumption^ 

The larger the concentration parameter k, the more the distribution is concentrated around the 
mean Q. By bi-invariance of fx, c„(«) is independent of Q: 

c n (n) = / exp(«;trace(Q T ^)) dfi(Z) = / exp(« trace(Z)) d/u(Z). (4.5) 

JSO(n) JSO(n) 

Since the integrand is a class function, Weyl's integration formulas apply for any value of n. 
Using (|4.2|) and (|4.3p . we work out explicit formulas for c„(k), n = 2, 3,4: 

c 2 (k)=/o(2k), (4.6) 
c 3 (k) = exp(K)(7 (2K) - J 1 (2/t)) ) (4.7) 
ci(«) = I (2k) 2 - 2/ 1 (2k) 2 + J (2/s) J 3 (2/s), (4.8) 

m terms of the modified Bessel functions of the first kind, J„ fsdl l. See Avvendix[B\ for details. 

For n = 2, the Langevin distribution is also known as the von Mises or Fisher distribution on the 
circle The Langevin distribution on SO(n) also exists in anisotropic form fTl /. Unfortunately, 
the associated pdf is no longer a spectral function even for Q = I , which is an instrumental property 
in the present work. Consequently, we do not treat anisotropic distributions for now. Chikuse gives 
an in-depth treatment of statistics on the Grassmann and Stiefel manifolds Fj3 /. including a study 
of Langevin distributions on SO(n) as a special case. 

Based on a uniform sampling algorithm on SO(n), it is easy to devise an acceptance-rejection 
scheme to sample from the Langevin distribution \lA . §2.5.2]. Not surprisingly, for large values of 
k, this tends to be very inefficient. Chiuso et al. report using a Metropolis-Hastings-type algorithm 
instead Yvk . §7/. Hoff describes an efficient Gibbs sampling method to sample from a mo re g eneral 
family of distributions on the Stiefel manifold, which can be modified to work on SO(n) \19] . 

For Lang(J, k), the following formulas will prove useful: 

log/(Z) = Ktrace(Z) - logc„(/t), (4.9) 
Vlog/(Z) = / t J„, (4.10) 
gradlog/(Z) = P z (Vlog/(Z)) = -nZsk{Z) , (4.11) 

where V denotes the usual gradient in the embedding space W nxn , grad denotes the gradient on 
SO(n) and Pz (|3-4[) is the orthogonal projector from the embedding space onto the tangent space 
TzSO(n), with sk(Z) = (Z — Z T )/2 the skew- symmetric part of Z — see Section^ 

The set of pdf's is closed under convex combinations, as is the set of functions satisfying our 
assumptions [1] and [31 We could therefore consider mixtures of Langevin distributions around I 
with various concentrations. In the next example, we combine the Langevin distribution and the 
uniform distribution. To the best of our knowledge, this is the first description of a heavy-tail-type 
distribution on SO(n). Such a distribution may prove useful for any application involving outliers 
in rotation estimation. 

Example 3 (isotropic Langevin with outliers) . We define the isotropic Langevin distribution with 
outliers on SO(?i) with mean Q € SO(rt), concentration K > and outlier probability 1 — p € [0, 1] 
via the pdf 

f(Z) = cxp( K tracc(Q T Z)) + (1 - p). (4.12) 

c„(k) 

For Z distributed as such, we write Z ~ LangUni(<5, K,p). For p = I, this is the isotropic Langevin 
distribution. For p = 0, this is the uniform distribution. Notice that f is a spectral function for 

Q = I. 

A random rotation sampled from LangUni(Q, K,p) is, with probability p , sampled from Lang(Q , k), 
and with probability 1—p, sampled uniformly at random. Measurements of Q distributed in such a 
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way are outliers with probability 1 — p, i.e., bear no information about Q w.p. 1 — p. For < p < 1, 
the LangUni is a kind of heavy-tailed distribution on SO(n). 
For LangUni(/, K,p), the following will prove useful: 

„,^n pK exp(/£ trace Z~) > . , , 

gradlog/ (Z ) = -£ ;\ , ; ZskZ, 4.13 

with sk(Z) = (Z — Z T )/2 denoting the skew- symmetric part of Z. 

To conclude this section, we remark more broadly that all isotropic distributions around the 
identity matrix have a spectral pdf. Indeed, let / : SO(ra) — > R be isotropic w.r.t. the geodesic 
distance on SO(n), dist(i?i, R 2 ) = ||log(i? ] r i? 2 ) || (|3.6p . That is, there is a function / such that 
f(Z) = /(dist(I,Z)) = /(j|logZ|| F ). It is then obvious that f(QZQ T ) = f(Z) for all Q G 
O(n) since log(QZQ T ) = Qlog(Z)Q T . The same holds for the embedded distance dist(i?i, R 2 ) = 
\\Ri — i?2 ||p- This shows that the assumptions proposed in Section [2] include many interesting 
distributions. 

Similarly we establish that all spectral pdf's have zero bias around the identity matrix /. The 
bias is the tangent vector (skew-symmetric matrix) fi = E{Log 7 (Z)}, with Z ~ /, / spectral. 
Since Logj(Z) = log(Z) (|3.5[) . we find, with a change of variable Z := QZQ T going from the first 
to the second integral, that for all Q <G 0(n): 

n= / log(Z) f(Z)d l i(Z) = [ \og(QZQ T ) f(Z)dfi(Z) = QflQ T . (4.14) 

^SO(n) JSO(n) 

Indeed, \og(QZQ T ) = Qlog(Z)Q T . Since skew-symmetric matrices are normal matrices and since 
O and fl T = —fl have the same eigenvalues, we may choose Q £ 0(n) such that QflQ T = — SI. 
Therefore, 17 = — SI = 0. As a consequence, it is only possible to treat unbiased measurements 
under the assumptions we make in this paper. 



The Fisher information metric for synchronization 

As described in Section [2j the relative rotation measurements = Z i ^R i Rj (|2.3|) reveal informa- 
tion about the true (but unknown) rotations R\, . . . , Rn- The Fisher information matrix (FIM) 
we compute here encodes how much information these measurements contain on average. In other 
words, the FIM is an assessment of the quality of the measurements we have at our disposal 
for the purpose of estimating the sought parameters. The FIM will be instrumental in deriving 
Cramer-Rao bounds in the next section. 

The FIM is a standard object of study for estimation problems on Euclidean spaces. In the 
setting of synchronization of rotations, the parameter space is a manifold and we thus need a more 
general definition of the FIM. We quote, mutatis mutandis, the definition of FIM as stated in Q 
following 



Definition 1 (FIM). Let V be the parameter space of an estimation problem and let 6 € V be 
the (unknown) parameter. Let f(y; 9) be the pdf of the measurement y conditioned by 0. The log- 
likelihood function L : V — > K is L(Q) = log/(y; 0). Let e = {ei, . . . , e^} be an orthonormal basis 
of the tangent space TqV w.r.t. the Riemannian metric (•,•)§. The Fisher information matrix of 
the estimation problem on V w.r.t. the basis e is defined by: 

F< i =E{(gradL(fl) J e i > 9 -( g radi(fl),e J -> 9 }. (5.1) 

Expectation in taken w.r.t. the measurement y. 

Hence, we need the gradient of the log-likelihood function L evaluated at the true rotations 
ReP. The i th component of that gradient is equal to (see (|3.13p ): 

grad, L(R) = ^ [gradlog f ij (Z ij )] T Z^IU. (5.2) 
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The vector field grad log /y on SO(n) may be factored into: 

gradlog f ij (Z) = ZGj j (Z), (5.3) 

where Gij : SO(?i) i— > so(n) is a mapping that will play an important role in the sequel. In 
particular, the i th gradient component now takes the short form: 

grad, L(R) = ^ G t] (Z t] )R t . (5.4) 

Let us consider a canonical orthonormal basis of so(n): (Ei, . . . ,Ed), where d = dimso(n) = 
(")■ For n = 3, we pick this one: 



Ex = -= -1 , E 2 = -= , £ 3 = -7= 1. (5.5) 






An obvious generalization yields similar bases for other values of n. We can transport this canonical 
basis into an orthonormal basis for the tangent space Tfl i SO(n) as (R4E1, . . . , RiEd). To compute 
the Fisher information matrix, we need an orthonormal basis for the tangent space Tr"P to the 
manifold V = SO(n) x • ■ ■ x SO(n). This matrix, of dimension dN x dN, is composed of N x N 
blocs of size d x d. Let us index the (k,£) entry inside the (i,j) bloc as Fij^i- Let us also fix an 
orthonormal basis for the tangent space at R = (R\, . . . , Rn) of V, as 

{£ik)i=i...N,k=i...d, with £ ifc = (0, ...,0, i?jE fc ,0, . . . ,0), 

a zero vector except for the i th component equal to RiE^- (5.6) 

Accordingly, the matrix F at R is defined by (sec Definition [T]): 

F ijM = E{<gradZ(R),£ ifc ) • (grad£(R),&<}} 

= E {(grad, L(R), R.Eu) • (grad,- L(R), 

= EE E {(G, r (Z jr ), i^irf) • <G ja (Z j4 ), RjE e R])} . (5.7) 

reV; sGV,- 

We prove that, in expectation, the mappings Gij : SO(n) — > so(n) are zero. This fact is 
directly related to the standard result from estimation theory stating that the average score V(0) = 
E {grad log f(y; 9)} for a given parameterized probability density function / is zero. 

Lemma 5.1. Given a probability density function f : SO(n) — > M + and the mapping G : SO(n) — > 
5o(n) such that gradlog/(Z) = ZG(Z), we have that E{G(Z)} = 0, where expectation is taken 
w.r.t. Z, distributed according to f. 

Proof. Define h(Q) — J g0 („) f(ZQ) d/i(Z) for Q € SO(n). Since / is a probability density function, 
bi-invariance of fi (|4.ip yields h(Q) = 1. Taking gradients with respect to the parameter Q, we 
get: 

= grad^Q) = f grad Q /(ZQ) dp(Z) = f Z T gv & df{ZQ) dfi(Z). 

JSO(n) JSO(n) 

With a change of variable Z := ZQ, by bi-invariance of /x, we further obtain: 

/ Z T grad/(Z) dfi(Z) = 0. 

JSO(n) 

Using this last result and the fact that gradlog /(Z) = j^jgiadf(Z), we conclude: 

E{G(Z)} = / Z T gradlog/(Z) f(Z)dft(Z) = f Z T gra,df(Z) dfi(Z) = 0. □ 

JSO(n) JSO(n) 
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We now invoke Assumption [2] (independence). Independence of Z^j and Z pq for two distinct 
edges and (jp, q) implies that, for any two functions 4>\,4>2 '■ SO(n) — > R, it holds that 

EiMZi^MZpi)} = ^{MZij)}E{Mz Pq )}, 

provided all involved expectations exist. Using both this and Lemma 15.11 most terms in (|5.7[) 
vanish and we obtain a simplified expression for the matrix F: 



^ E {(Gir(Zir), RiE k Rj) ■ (Gi r {Z ir ),R i E t Rj)'), 



F, 



ij,k£ 



r6V, 



0, 



if i = j, 

ili^j and e£, (5.8) 
if i ^ j and (i, j) ^ £. 



We further manipulate the second case, which involves both Gij and Gji, by noting that those are 
deterministically linked. Indeed, by symmetry of the measurements (H^ = Hj^), we have that (i) 
Zji = RjRjZjjRjRj and (ii) fij(Zij) = fji(Zji). Invoking Assumption |3l since Zij and Zji have 
the same eigenvalues, it follows that fij(Z) = fji(Z) for all Z G SO(n). As a by-product, it also 
holds that Gij(Z) = Gji(Z) for all Z € SO(n). Still under Assumption [3J we show in Appendix [Dl 
that 



VQ e O(n), Gij(QZQ T ) = Q G V} {Z) Q? 
Combining these observations, we obtain: 



and G l] {Z T ) = -Gij(Z). 



(5.9) 



Gji(Zji) - Gij(Zji) - ("/, ,•: /.',/.', Z ij I{,I{ J :■ - /.',/.', C', ; 'Z, ; :■ //,/.', . (5.10) 

The minus sign, which plays an important role in the structure of the FIM, comes about via the 
skew-symmetry of Gij. The following identity thus holds: 

{GjiiZj^RjE^J) = - (G^Z^R^Rj) . (5.11) 

This can advantageously be plugged into (|5.8|) . 

We set out to describe the expectations appearing in (|5.8[) . which will take us through a couple 
of lemmas. Let us, for a certain pair (i, j) £ £, introduce the functions hk '■ SO(n) — > R, k = 1 . . . d: 



(5.12) 



where we chose to not overload the notation hk with an explicit reference to the pair as 
this will always be clear from the context. We may rewrite the FIM in terms of the functions hk, 
starting from (|5.8I) and incorporating (|5.11[) : 



^ ^{hk{Z lr ) ■ hi(Z ir )}, if i=j, 
F ijM = \ -EihkiZ^-hiiZij)}, 



if i ^ j and E £, 



(5.13) 



0, if i j=j and (i, j) £ £. 

Another consequence of Assumption [3] is that the functions h k (Z) and hi(Z) are uncorrelatcd for 
k =/= I, where Z is distributed according to the density fy. As a consequence, Fij^i = for k ^ £, 
i.e., the d x d blocks of F are diagonal. We establish this fact in Lemma [5751 right after a technical 
lemma. 

Lemma 5.2. Let E,E' G so(n) such that Eij = —Eji = 1 and E' kf = —E' ik = 1 (all other entries 
are zero), with (E,E'} = 0, i.e., {i,j} ^ {k,£}. Then, there exists P £ 0{n) a signed permutation 
such that P T EP = E' and P T E'P = -E. 

Proof. See Appendix [E] □ 
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Lemma 5.3. Let Z £ SO(n) be a random variable distributed according to fij. The random 
variables hk(Z) and hi(Z), k ^ £, as defined in (|5.12[) have zero mean and are uncorrelated, i.e., 
E{h k {Z)} = E{h e (Z)} = and E{h k (Z) ■ h e (Z)} = 0. Furthermore, it holds that E {h 2 k {Z)} = 
E{hj(Z)}. 

Proof. The first part follows directly from Lemma 15.11 We show the second part. Consider a 
signed permutation matrix P k i £ O(n) such that P kl E k P kl = E e and P k £E t P kl = —E k . Such a 
matrix always exists according to Lemma T5. 2 1 Then, identity (|5.9p yields: 

hkiR^Rj Z R t P k \Rj) = (Gij (Z), R^E^Rj) = h e (Z). (5.14) 

Likewise, 

heiR^Rj Z R^Rj) = -h k (Z). (5.15) 

These identities as well as the (extended) bi-invariancc (|4.1[) of the Haar measure /i on SO(n) and 
the fact that fij is a spectral function yield, using the change of variable Z := R t P ke Rj Z R^Rj 
going from the first to the second integral: 



E{h k (Z)-h e (Z)} = h k (Z)h e (Z) f l3 (Z)dfi(Z) (5.16) 

JSO(n) 

-h e {Z)h k {Z) f lJ (Z)d^(Z) = -E{h k (Z)-he{Z)}. (5.17) 

SO(n) 

Hence, E {h k (Z) ■ he(Z)} = 0. We prove the last statement using the same change of variable: 
E{h 2 k (Z)}= [ ht(Z) fnWMZ) = I t$(Z) f ij (Z)d»(Z)=E{h 2 i (Z)}. 

JSO{n) JSO(n) 

We note that, more generally, it can be shown that the h k 's are identically distributed. □ 

The skew-symmetric matrices (R i E 1 Rj, . . . , R^^J), d = dimso(n), form an orthonormal 
basis of the Lie algebra so(n). Consequently, we may expand each mapping Gij in this basis and 
express its squared norm as: 

d d 

Gij(Z)=J2hk(Z)-RiE k Rj, \\Gij{Z)f = Y,h 2 k {Z). (5.18) 

fe=i fc=i 

Since by Lemma 15.31 the quantity E {h\{Zij)\ does not depend on k, it follows that: 

®{hl(Z i j)} = ~E{\\G i j(Z ij )\\ 2 }, k = l...d. (5.19) 

This further shows that the d x d blocks that constitute the FIM have constant diagonal. Hence, F 
can be expressed as the Kronecker product ((g)) of some matrix with the identity Id- Let us define 
the following (positive) weights on the edges of the measurement graph: 

Wij = wn ^ElllGy^OH 2 } ^E{||gradlog/. y (Z„)|| 2 } , V(i,j) £ £. (5.20) 

Then, the weighted Laplacian matrix C £ M. NxN , C = C T y 0, of the weighted measurement graph 
is given by: 

-Wij, iti^j and (i,j) £ £, (5.21) 

0, if ijLj and (££. 

It is now apparent that the matrix F £ ^ dN><dN — se e equation (|5.13[) — is tightly related to the 
weighted Laplacian matrix C (|5.21|) . We summarize this in the following theorem. 
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Theorem 5.4 (FIM for synchronization). Let R\, . . . ,Rn G SO(n) be unknown but fixed rotations 
and let = Z^R^Rj for E £,i < j, with the Zij 's random rotations whose distributions 
fulfill Assumptions [7H21 We consider the problem of estimating the Ri 's given a realization of the 
Hij 's. The Fisher information matrix (Definition^ of that estimation problem with respect to the 
basis (|5.6[) is given by 

F=^(C®I d ), (5.22) 

where <8> denotes the Kronecker product, d = dimSO(n) = n{n — l)/2, I& is the d x d identity 
matrix and C is the weighted Laplacian matrix of the measurement graph defined by (|5.21[) . 

The Laplacian matrix has a number of properties, some of which will yield nice interpretations 
when deriving the Cramer-Rao bounds. One remarkable fact is that the Fisher information matrix 
does not depend on R = (R\, . . . , Rn), the set of true rotations. This is an appreciable property 
seen as R is unknown in practice. This stems from the strong symmetries in our problem. 

Another important feature of this FIM is that it is rank deficient. Indeed, for a connected 
measurement graph, C has exactly one zero eigenvalue (and more if the graph is disconnected) 
associated to the kernel vector of all ones, 11 jy- The kernel of the FIM is thus composed of all 
vectors of the form l^v <£> t, with t £ M. d arbitrary. This corresponds to the vertical spaces of V 
w.r.t. the equivalence relation Q3.17p . i.e., the kernel consists in all tangent vectors that move all 
rotations Ri in the same direction, leaving their relative positions unaffected. This makes perfect 
sense: the distribution of the measurements is also unaffected by such changes, hence the FIM, 
seen as a quadratic form, takes up a zero value when applied to the corresponding vectors. We 
will need special tools to deal with this (structured) singularity when deriving the CRB's in the 
next section. 

Notice how Assumption [2] (independence) gave F a bloc structure based on the sparsity pattern 
of the Laplacian matrix, while Assumption [3] (spectral pdf's) made each bloc proportional to the 
d x d identity matrix and made F independent of R. 

In the two following examples, we explicitly compute the weights Wij (|5.20p associated to 
two types of noise distributions: (1) unbiased isotropic Langevin distributions (akin to Gaussian 
noise), and (2) a mix of the former distribution and complete outliers (uniform distribution) — 
see Section [4] Combining formulas for the weights and equations (|5.21[) and (|5.22[) , one can 
compute the Fisher information matrix explicitly. 

Example 4 (Langevin distributions). (Continued from Example^) Let us consider the isotropic 
Langevin distribution over SO(n) , with concentration parameter k > and mean at the identity 
matrix: f(Z) = - ^ exp(KtraceZ). From the expression of gradlog/(Z) (|4.11[) . we find that 

G(Z) = Ksk(Z) = %(Z — Z ). Hence, the weight w associated to the noise distribution f is a 
function a n (n) given by: 

w = a n (K,)=E{\\G(Z)\\ 2 } f \\Z-Z T f f(Z)d(M(Z). (5.23) 

4 JSO(n) 

Since the integrand is again a class function, we may evaluate this integral using Weyl's inte- 
gration formulas — see Section ^ and Appendix \B\ for an example. In particular, for n — 2 or 3, 
identities (|4.2[) apply and we derive the following expressions: 

, x Ii(2k) , \ k (2 - k)Ii(2k) + kI 3 (2k) 

The functions I v {z) are the modified Bessel functions of the first kind (|B.8[) . We used the for- 
mulas for the normalization constants C2(k) (|4.fj[) and c^{n) (|4.7[) as well as the identity Ii(2k) = 
k(/ (2k)-/ 2 (2k)). 

For the special case n = 2, taking the concentrations for all measurements to be equal to K, we 
find that the Fisher information matrix F is equal to the weighted Laplacian matrix C, which is 
proportional to the unweighted Laplacian matrix £q = D — A, with D the degree matrix and A the 
adjacency matrix of the measurement graph: 

F = C = a 2 (K)C . (5.25) 
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This particular result was shown before via another method in For the derivation in the latter 

work, commutativity of rotations in the plane is instrumental, and hence the proof method does 
not — at least in the proposed form — transfer to SO(n) for n > 3. 

Example 5 (Langevin/outlier distributions). (Continued from Example^ Given a concentration 
K > and a probability p 6 [0, 1] , the p.d.f. 

f(Z) = — ^— cx P (k trace Z) + 1 -p (5.26) 
c„(k) 

corresponds to picking a rotation, with probability p, around the identity matrix following an 
isotropic Langevin distribution with concentration k, or, with probability 1—p, uniformly at random 
anywhere on SO (n). This pdf satisfies all our assumptions. The formula for gradlog/(Z) (|4.13|) 
yields G(Z) = VK cx P(^ acc - z ) sk(Z). We thus need to evaluate 

( \ f 1 f P K cxp(KtraccZ)\ 2 T 2 

w = a n (n,p)= TT^V 7T~ Tl \Z-Z\\dfi(Z). 5.27 

isO(n) f( Z ) V 2c «( K ) / 

The integrand being a class function again, Weyl's formula applies and after some algebra using 
the material in Section^ we obtain formulas a n {n,p) for n = 2 or 3: 

, . (pk) 2 1 r (l-cos20)exp(4«;cos0) ,„ 
C2{K) n J pcxp{2ncos9) + (1 - p)c 2 {n) 

_ {pnf cxp(2 K ) 1 f v (1 - cos 28)(l - cos 6) cxp(4 K cos 9) 
a3[ ^ P) - C3 (k) Trio pexp( K (l + 2cos0)) + (l-p)c 3 ( K ) dW - ^ 

These integrals may be evaluated numerically. The same machinery goes through for n > 4. 



Cramer-Rao bounds for synchronization 

Classical Cramer-Rao bounds (CRB's) give a lower bound on the covariance matrix C of any 
unbiased estimator for an estimation problem in R™. In terms of the Fisher information matrix F 
of that problem, the classical result reads C >z F~ x . In our setting, an estimation problem on a 
manifold with singular F, we need to resort to a more general statement of the CRB. 

First off, because our parent parameter space V is a manifold instead of a Euclidean space, we 
need generalized definitions of bias and covariance of estimators — we will quote them momentarily. 
And because manifolds are usually curved — as opposed to Euclidean spaces which are flat — the 
CRB takes up the form C >z F^ 1 + curvature terms j36|. We will compute the curvature terms 
and show that they become negligible at large signal-to-noise ratios (SNR). 

Secondly, owing to the indeterminacies in our estimation problem (based on the -ffy's only, 
we can only recover the Ri's up to a global rotation), the FIM (|5.22l) is singular, with a kernel 
that is identifiable with the vertical space (|3.23[) . In p}, CRB's are provided for this scenario by 
looking at the estimation problem cither on the submanifold V a (where indeterminacies have been 
resolved by fixing anchors) or on the quotient space V$ (where each equivalence class of rotations 
is regarded as one point). 

We should bear in mind that intrinsic Cramer-Rao bounds are fundamentally asymptotic 
bounds for large SNR (36|. At low SNR, the bounds may fail to capture features of the esti- 
mation problem that become dominant for large errors. In particular, since our parameter spaces 
V A and are compact, there is an upperbound on how badly one can estimate the true rotations. 
Because of their local nature (CRB's focus on small error scenarios), the bounds we establish here 
are unable to capture this essential feature and should not be used in low SNR regimes. 

We now give a definition of unbiased estimator. After this, we differentiate between the an- 
chored and the anchor-free scenarios to establish CRB's. 

Definition 2 (unbiased estimator). Let V be the parameter space of an estimation problem. Let 
f(y;9) be the pdf of the measurement y G M. conditioned by 9 g V . An estimator is a mapping 
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6 : Ai — > V assigning a parameter 9{y) to every possible realization of the measurement y. The 
bias of an estimator is the vector field b on V: 

yeev, b(0)^=E{Lo ge (!%))}. (6.1) 

An unbiased estimator has a vanishing bias b = 0. 



Anchored synchronization 

When anchors are provided, the rotation matrices Ri for i 6 A, A ^ are known. The parameter 
space then becomes Va (|3.14[) . which is a Riemannian submanifold of V. The synchronization 
problem is well-posed on Va, provided there is at least one anchor in each connected component 
of the measurement graph. 

We set out to use [7J, Theorem 4], which states Cramer- Rao bounds for estimation problems 
on Riemannian submanifolds. To this end, we first define the dN x dN covariance matrix Ca for 
anchored synchronization, then we introduce a projected FIM Fa, and we finally state the bound. 

Definition 3 (anchored covariance). Following f^, eq. (2)], the covariance matrix of an estimator 
R mapping each possible set of measurements Hij to a point in Va, expressed w.r.t. the orthonormal 
basis (|5.6[) ofTnV, is given by: 

{C A ) ijM = E{(Log R (R),6fc) • (lo 8r (R),^)} , (6.2) 

where we used the same indexing convention as for the FIM. Of course, all d x d blocks (i,j) such 
that either i or j or both are in A are zero by construction. In particular, the variance of R is the 
trace of Ca: 

v&r A = trace Ca = E |||Log R (R)|| 2 | = E |dist 2 (R, R)| , (6.3) 

where dist is the geodesic distance on Va (|3.16l) . 

In order to use Theorem 4], we need to compute Fa — PaFPa, where Pa is the orthogonal 
projector from TrP to TrPa, expressed w.r.t. the orthonormal basis (|5.6[) . The effect of Pa is 
to set all rows and columns corresponding to anchored rotations to zero. Thus, we introduce the 
masked Laplacian Ca'- 

(£a)h = < n J . (6.4) 

otherwise. 



Then, the projected FIM is simply: 



F A = -XC A ®I d ). (6.5) 
d 



The pseudoinverse of Fa is given by F\ = d(C\ ® Id), since for arbitrary matrices A and B, it 
holds that (A®B) t = At® fit Prop. 6]. Notice that the rows and columns of C\ corresponding 
to anchors are also zero. The aforementioned theorem 0, Thm. 4] then yields the sought CRB: 

Ca <Z d(C A ® Id) + curvature terms. (6-6) 

We compute the curvature terms explicitly in Appendix [C] and show that they are on the order of 
SNR -2 . They can hence be neglected for large SNR, which makes for easier formulas. In particular, 
for n = 2, the manifold Va is flat. Hence, the curvature terms vanish exactly and the CRB takes 
up the form: 

C A h O a . (6.7) 
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Figure 1: The Cramer-Rao bound for anchored synchronization (|6.10l) limits how well each in- 
dividual rotation can be estimated. We illustrate the effect of anchors on the two identical syn- 
chronization graphs above, where all edges have the same weight (i.i.d. noise). Anchors are red 
squares. Unknown rotations are the blue round nodes. The area of node i is proportional to the 
lower-bound on the average error for this node E |dist 2 (i?;, On the left, there is only one 

anchor in the upper-left cluster. Hence, nodes in the lower-left cluster, which are separated from 
the anchor by two bottlenecks, will be seriously harder to estimate accurately than in the situation 
on the right, where there is one anchor for each cluster. This (obvious) fact is supported by precise, 
quantified bounds, making it possible to automate network analysis and design for large graphs. 

For n = 3, including the curvature terms yields this CRB: 

C A h 3 U A - \ (ddiag(4)4 + 4ddiag(4)) \ ® I 3 , (6.8) 

where ddiag sets all off-diagonal entries of a matrix to zero. The curvature terms are thus on 
the order of 0(X^ aax (C' A )), which indeed becomes small at large SNR. For general n, neglecting 
curvature for n > 3, the variance is lower-bounded as follows: 

var A = E jdist 2 (R,R)j > d 2 trace/^, (6.9) 

where dist is as defined by p,16p . It also holds for each node i that 

Ejdist 2 ^,^)} > d 2 (C A ) U - (6.10) 

This leads to a useful interpretation of the CRB depicted in Figure \T\ 

Interestingly, for synchronization on the group of translations R™, the CRB involves the same 
Laplacian- related matrix Q. Barooah and Hespanha give interpretations of this bound in terms 
of the resistance distance on the measurement graph when there is exactly one anchor This 
analogy is notably useful in verifying important properties of the bound, such as its behavior when 
edges or nodes are added or deleted, etc. 

Anchor-free synchronization 

When no anchors are provided, the rotation matrices Ri can only be recovered up to a global 
rotation, leading to an equivalence relation (|3.17p on V, which in turn leads us to work on the 
Riemannian quotient parameter space V% p,18|) . The synchronization problem is well-posed on 
V% as long as the measurement graph is connected, which we always assume in this work. 
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We set out to use 0, Theorem 5] , which states Cramer- Rao bounds for estimation problems on 
Ricmannian quotient manifolds. To this end, we first define the dN x dN covariance matrix 
for anchor-free synchronization, then we state the bound. 

Definition 4 (anchor-free covariance) . Following 0, eg. (25)], the covariance matrix of an estima- 
tor [R] mapping each possible set of measurements Hij to a point in V§ ( that is, to an equivalence 
class inV), expressed w.r.t. the orthonormal basis (|5.6p o/TrT 5 , is given by: 

{C % ) ijM =E{(t,tik) ■ , with (6.11) 

^=(D7r(R)| HR )- 1 [Log [R] ([R])]. (6.12) 

That is, £ (the random error vector) is the shortest horizontal vector such that Exp R (£) € [R]. 
We used the same indexing convention as for the FIM. In particular, the variance of [R] is the 
trace of C : 

var = traceC* = E |||Log [R] ([R])|| 2 | = E |dist 2 ([R], [R])| , (6.13) 

where dist is the geodesic distance on V% (|3.30|) . 

Theorem 0, Thm. 5] then yields the sought CRB: 

C y d(C^ ® Id) + curvature terms. (6-14) 

We compute the curvature terms explicitly in Appendix [C] and show they can be neglected for 
(reasonably) large enough SNR. In particular, for n = 2, the manifold V% is flat. Hence: 

C h £ f . (6.15) 

For n = 3, the curvature terms are the same as those for the anchored case, with an additional 
term that decreases as 1/N. For (not so) large N then, the bound (|6.8[) is a good bound for 
n = 3, anchor-free synchronization. For general n, neglecting curvature for n > 3, the variance is 
lower-bounded as follows: 

var = E | dist 2 ([R], [R])| > d 2 trace/: 1 , (6.16) 

where dist is as defined by (|3.30[) . 

For the remainder of this section, we work out an interpretation of (|6.14j) . This matrix inequality 
entails that, for all x £ M. dN (neglecting curvature terms if needed): 

x T C $ x > dx T (C f ® I d )x. (6.17) 

As both the covariance and the FIM correspond to positive semidefinite operators on the horizontal 
space Hr, this is really only meaningful when x is the vector of coordinates of a horizontal vector 
r) = (rjx, . . . , j/jv) € Hr.. Let e^, ej denote the i-th and j-th columns of the identity matrix In and 
let efe denote the fc-th column of Id- We consider x — (e^ — e 3 ) (g) e^, which corresponds to the 
zero horizontal vector r) except for r\i = RiEk and r)j = —RjEk, with Ek G so(n) the fc-th element 
of the orthonormal basis of so(n) picked as in (|5.5[) . By definition of C% and of the error vector 

I = {R 1 n u . . . , R N n N ) e h r 4H2), 

x T C x = E{(^,7 ? ) 2 } =E{(n i -n j ,E k ) 2 }. (6.18) 
On the other hand, we have 

dx T (C Jf <g> I d )x = d{ei- e J ) T C\e l - e 3 ). (6.19) 

These two last quantities are related by inequality (|6.17[) . Summing for k = 1 . . . d on both sides 
of this inequality, we find: 



E 



{||fii - iljWl } > d 2 (e 4 - ej) T tf(ei - e 3 ). (6.20) 
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Now remember that the error vector £ (|6.12[) is the shortest horizontal vector such that Exp R (£) £ 
[R]. Without loss of generality, we may assume that R is aligned such that Exp R (£) = R. Then, 
Ri = Ri exp(f2j) for all i. It follows that 

fl i AT=i? i exp(n i )exp(-n i )iiT ) hence (6.21) 
dist^ii^T R.RJ) = 1 1 log (exp(a i ) exp(-Slj)) || F . (6.22) 

For commuting Oj and Qj — which is always the case for n = 2 — we have 

log (exp(f2i) exp(— f2 3 )J = Hi — Qj. (6.23) 

For n > 3, this still approximately holds at high enough SNR (that is, for small enough f2j,f2j), 
by the Baker-Campbell-Hausdorff formula. Hence, 

E{dist 2 (i? l i?J,i? i .Rj)} w E |||Oi - fijUpj > d 2 (e* - e J ) T £ t (e l - &,•)• (6.24) 

The quantity (e, — e^ ) C'(ej — e^) is sometimes called the squared Euclidean Commute Time 
Distance (ECTD) [32j between nodes i and j (up to a normalization constant). It is also known 
as the electrical resistance distance. It is rich in interpretation: the ECTD between two nodes 
is inversely proportional to the ease of communication between these two nodes. It decreases 
whenever the number of paths between them increases or whenever an existing path is made more 
informative, i.e., weights on that path are increased. Put this in perspective considering that in 
our setting, weights are proportional to the quality of the measurements. 

Still in [32| , it is shown in Section 5 how the ECTD can be used to embed the nodes in the 
plane such that two nodes are close- by if they have small ECTD. For synchronization, such an 
embedding naturally groups together nodes whose relative rotations can be accurately estimated, 
as depicted in Figure^ 

The CRB for anchor-free synchronization thus sets a lower-bound on how well one can estimate 
the relative rotation between Ri and Rj . The bound reflects how well the corresponding nodes are 
connected in the information-weighted measurement graph. We emphasize that the anchor-free 
CRB does not say anything about any particular rotation. 



Comments on, and consequences of the CRB 

Neglecting curvature terms — which we showed is exact for n = 2 and legitimate for n > 3 at 
reasonably favorable SNR — the Cramer-Rao bounds yield simple lower-bounds on the variance of 
unbiased estimators for synchronization. The right-hand sides of these bounds could in turn be 
used as cost function in an effort to optimize network structure or anchor placement. We take a 
closer look at the anchor-free bound to determine what structural properties it is affected by. 
At large SNR, the anchor- free bound, normalized by the number of rotations N, reads (|6.16[) : 

E{MSE}^E|ldist 2 ([R],[R])| > ^ traced), ( 7 .1) 

where E{MSE} as defined is the expected mean squared error of an unbiased estimator [R]. This 
expression shows the limiting role of the trace of the pseudoinverse of the information- weighted 
Laplacian C (|5.21[) of the measurement graph. This role has been established before for other 
synchronization problems for simpler groups and simpler noise models 0, l2~0 ] . We now shed some 
light on this result by stating a few elementary consequences of it. We let 

= Ai < A 2 < •• ■ < X N (7.2) 

denote the eigenvalues of £, where A2 > means the measurement graph is assumed connected. 
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Figure 2: The Fisher information matrix for anchor- free synchronization is essentially the Laplacian 
of the measurement graph, whose edges have weights proportional to the amount of information 
these edges contain (|5.22p . As a result, the Cramer-Rao bound limits the accuracy to which the 
relative rotation between two nodes can be estimated in proportion to the Euclidean commute 
time distance (ECTD) separating them in the graph (|6.24p 32j. On the left, a synchronization 
graph: each node corresponds to a rotation to estimate, all edges correspond to measurements 
of relative rotations affected by i.i.d. noise, and hence have the same weight. Nodes are colored 
according to the Fiedler vector of the weighted graph. Node positions are irrelevant. On the right, 
ECTD-cmbcdding of the same graph in the plane, such that the distance between two nodes i 
and j in the picture is mostly proportional to the ECTD separating them, which is essentially a 
lower-bound on E |dist 2 (i?, ; i?J, R^Rj)^. In other words: the closer two nodes are, the better their 
relative rotation can be estimated. 



A larger Fiedler value is better. The right-hand side (rhs) of (|7.1[) in terms of the Aj's is 
given by (omitting the unimportant d 2 factor): 

N 

- traced) = £ _L < « (for large N) . (7.3) 

i=2 

The second eigenvalue A2 is known as the Fiedler value (or algebraic connectivity) of the information- 
weighted measurement graph. It is well known that the Fiedler value is low in the presence of bot- 
tlenecks in the graph and high in the presence of many, heavy spanning trees. The latter equation 
translates in the following intuitive statement: by increasing the Fiedler value of the measurement 
graph, one can force the rhs of the CRB (|7.ip down. Not surprisingly then, expander graphs are 
ideal for synchronization, since, by design, their Fiedler value A2 is bounded away from zero while 
simultaneously being sparse [20| . 

Notice that the Fiedler vector has zero mean (it is orthogonal to Ijv) and hence describes the 
horizontal vectors of maximum variance. It is thus also the first axis of the graph on the right in 
Figure [2J 



The CRB is an asymptotic bound. Intrinsic Cramer-Rao bounds are asymptotic bounds, 
that is, they are meaningful for small errors. In particular, in the present scenario the parameter 
spaces Va and V$ are compact, hence there is an upper-bound on the variance of any estimator. 
The CRB is unable to capture such a global feature, and hence for arbitrarily low SNR, the CRB 
will predict an arbitrarily large variance (and will be violated). As a means to locate the point 
at which the CRB certainly stops making sense, we consider the problem of estimating a rotation 
matrix R £ SO(n) based on a measurement Z £ SO(n) of R, and compute the variance of the 
(unbiased) estimator R(Z) := Z when Z is uniformly random, i.e., when no information is available. 
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Define V n = E {dist 2 (Z, R)} for Z ~ Uni(SO(n)). A computation using Weyl's formula yields 

«-5£lM«0ll>-5/>*- 2 r 



2 2?r 2 

^3 = — +4. (7.4) 



A reasonable upper-bound on the variance of an estimator should be N'V n , where N' is the number 
of independent rotations to estimate (N — 1 for anchor- free synchronization, N — \A\ for anchored 
synchronization). A CRB larger than this should be disregarded. 

trace(Z^) plays a limiting role in synchronization. The quantity trace(Z^) appears naturally 
in CRB's for synchronization problems on groups For complete graphs and constant weight 



, tracc(£t) = 1L=1 „ _L. Then, by dJ), 



E{MSE}>^. (7.5) 

If the measurement graph is sampled from a distribution of random graphs, trace(£^) becomes a 
random variable. We feel that the study of this random variable for various families of random 
graph models, such as Erdos-Renyi, small-world or scale-free graphs [22| is a question of interest, 
probably best addressed using the language of random matrix theory. 

Let us consider Erdos-Renyi graphs Gjv.g with N nodes and edge density q £ (0,1), that is, 
graphs such that any edge is present with probability q, independently from the other edges. Let all 
the edges have equal weight w. Let £iv,g be the Laplacian of a Gjv,g graph. The expected Laplacian 
is E{£at,,j} = wq(NlN — IatxAt), which has eigenvalues Ai = 0, A2 = • • • = Xn = Nwq. Hence, 
trace(E {£jv,g}^) = ^jt~~^- A more useful statement can be made using 0, Thm. 1.4] and [l^, 
Thm. 2]. These theorems state that, asymptotically as N grows to infinity, all eigenvalues of 
C-N,q/N converge to wq (except of course for one zero eigenvalue). Consequently (details omitted), 



lim tracc(£j V(J ) = — (in probability). (7-6) 
For large N, we use the approximation trace(£jy ) w 1/wq. Then, by (|7.1I) . for large N we have: 



d 2 

E{MSE}> . (7.7) 

wqN 

Notice how for fixed measurement quality w and density q, the lower-bound on the expected MSE 
decreases with the number N of rotations to estimate. 



Synchronization can withstand many ouliers. Consider the Langevin with outliers distribu- 
tion from Example [SJ where (on average) a fraction 1 — p of measurements are sampled uniformly 
at random, i.e., they are outliers. The information weight w(p) = a n (n,p) for some fixed con- 
centration k > is given by equations (|5.28[) and (|5.29D for n = 2 and 3 respectively. A Taylor 
development around p = shows that 

w(p) =a n , K p 2 +0(p 3 ) (7.8) 

for some positive constant a„ jK . Then, for p -C 1, building upon (|7.5p for complete graphs with 
i.i.d. measurements we get: 

E{MSE}>^— -. (7.9) 

If one needs to get the rhs of this inequality down to a tolerance e 2 , the probability p of a mea- 
surement not being an outlier needs to be at least as large as: 

Pe ± ~^A=- (7-10) 



23 



If p s > 1, synchronization cannot be solved to the desired accuracy. The 1/vTV factor is the most 
interesting: it establishes that as the number of nodes increases, synchronization can withstand 
more and more outliers. 

This result is to be put in perspective with the bound in [34l . eq. (37)] for n = 2, k = oo, 
where it is shown that as soon as p > l/y/N, there is enough information in the measurements (on 
average) for their eigenvector method to do better than random synchronization. It is also shown 
there that, as p 2 N goes to infinity, the correlation between the eigenvector estimator and the true 
rotations goes to 1. Similarly, we see that as p 2 N increases to infinity, the rhs of the CRB goes to 
zero. Our analysis further shows that the role of p 2 N is tied to the problem itself (not to a specific 
estimation algorithm), and remains the same for n > 2 and in the presence of Langevin noise on 
the good measurements. 

Building upon ALU) where it is stated that trace(£ t ) « 1/wq = TV (TV - l)/(2Mw) for Erdos- 
Renyi graphs with TV nodes and M edges, we define p £ for such graphs as: 



p -v^yii- (7 - u) 

To conclude this remark, we provide numerically computable expressions for a„_ K , n = 2 and 3 
and give an example: 

a 2K = o / \ I (1 - cos26»)exp(4KCOs6»)d6», (7.12) 
ttc^(k) J 

Ik pit 

— / (1 -cos20)(l -cos6>)exp(4Kcos6>)d6>. (7.13) 
7rcg(/c) J 



a3, K 



As an example, we generated an ER graph with TV = 2500 nodes and edge density of 60% for 
synchronization of rotations in SO(3) with i.i.d. noise following a LangUni(7, k = 7,p). The 
CRB (|7.1[) . which requires complete knowledge of the graph to compute trace(£^), tells us that we 
need p > 2.1% to reach an accuracy level of e — 10 _1 (for comparison, e 2 is roughly 1000 times 
smaller than V3 (|7.4p ) . The simple formula (|7.11[) , which can be computed quickly solely based on 
the graph statistics TV and M, yields p e = 2.2%. 



Conclusions and perspectives 

In this work, we considered the problem of estimating a set of rotations Ri € SO(n) based on pair- 
wise measurements of relative rotations -Rji?J. We provided a framework to study synchronization 
as estimation on manifolds for arbitrary n under a large family of noise models. We established 
formulas for the Fisher information matrix and associated Cramer-Rao bounds of synchronization 
and provided interpretation and visualization tools for them in both the anchored and anchor- 
free scenarios. In the analysis of these bounds, we notably pointed out the high robustness of 
synchronization against random outliers. 

Because of the crucial role of the pseudoinverse of the Laplacian C) of weighted graphs (and their 
traces) in the CRB's we established, it would be interesting to study efficient methods to compute 



such objects, see e.g. [23j. Likewise, exploring the distribution of trace(£^) seen as a random 
variable for various models of random graphs should bring some insight as to which networks are 
naturally easy to synchronize. Expander graphs already emerge as good candidates. 

The Laplacian of the measurement graph plays the same role in bounds for synchronization of 
rotations as for synchronization of translations. Carefully checking the proof given in the present 
work, it is reasonable to speculate that the Laplacian would appear similarly in CRB's for syn- 
chronization on any Lie group, as long as we assume independence of noise affecting different 
measurements and some symmetry in the noise distribution. Such a generalization would in par- 
ticular yield CRB's for synchronization on the special Euclidean group of rigid body motions, 
R 3 x SO(3). 

In future work, we will leverage the formulation of synchronization as an estimation problem 
on manifolds to propose maximum-likelihood estimators for synchronization. Such approaches will 
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result in optimization problems on the parameter manifolds whose geometries we described here. 
By executing the derivations with the Langevin + outliers noise model, this effort should lead to 
naturally robust synchronization algorithms. 
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Registration on SO(n) 

Let Q,Q' £ SO(n). The squared Riemannian distance on SO(n) between Q and Q' is: 

dist 2 (Q,Q') = ||log(Q'Q T )||p = ||log(Q T Q')||F- (A.l) 

By extension, let R = (R u . . . , R N ) £V = SO(n) x • • • x SO(n) and R' = (R[, . . . , R' N ) £ V. The 
squared Riemannian distance between R and R' on V is 

JV 

dist 2 (R,R') = ^2dist 2 (R l ,R' i ). (A.2) 

i=l 

We use the notation RQ = (RiQ, . . . , RnQ) € V . Recall that V$ is the quotient manifold defined 
as V% = V I ~, with the equivalence relation R ~ R 3Q £ SO(n) : R = R'Q. Then, the 
squared Riemannian distance between the equivalence classes [R] and [R'] is defined in terms of 
the distance on ? as: 

dist 2 ([R], [R]) = min dist 2 (RQ, R'Q'). (A.3) 

Q,Q'eso(ri) 

That is, the distance between the two sets [R] and [R'] is the distance separating two of the closest 
points HQ £ [R] and R'Q' £ [R'] of these sets. This is sometimes referred to as registration on 
SO(n), as our goal is to align R and R' such that they look as similar as possible. One only 
needs to optimize over either Q or Q 1 . Choosing to optimize over Q, we further develop the latter 
expression: 

N 

dist 2 ([RUR'l) = min dist 2 (Q,R')= min V dist 2 (i?;Q, R') (A.4) 
u J 1 u oeso(n) ; oeso(n)^ v ^' v ; 

N N 

= min V \\\og(Q T RjR' i )\\l = min V dist 2 (Q, RFbZ). (A.5) 

OeSO(n)^ 11 llF QGSO(n)^ 

Hence, an optimal Q is a Marcher mean — or intrinsic mean or Riemannian center of mass — of the 
rotation matrices RjR[, ■ ■ ■ , Rj[R' N . 

Although the minimum value is always well-defined because SO(n) is compact, the optimal 
rotation Q may, in general, not be unique. Manton [24[ gives a simple condition under which the 
Karcher mean on compact Lie groups is unique, and gives a convergent algorithm to compute it. 
On SO(n), this result reads: 
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Theorem A.l. (Manton 2004) Let RjR[ , . . . , RjjR' N G SO(n). If there exists Y G SO(n) and 

< r < 7r/2 such that dist(Y", RjR^) < r for i = 1 . . . N, then (i) the Karcher mean, i.e., the global 
minimizer of (|A.5[) . is unique, and (ii) the sequence (Qk)k=o,i,2... defined by 

Q k+1 = Q k cxp f 1 l og(QjRjRdj (A.6) 
wit/i initial condition Qq = RjR[ converges to the global minimizer. 

Proof. See Thm. 5]. □ 

An easy — but possibly inconclusive — way of checking the condition a posteriori after K iter- 
ations is to take Y := Qk in the theorem condition and verifying that dist (Y, RjR'i) < n/2 for 

1 = 1 . . . N . This result may not be optimal, and faster algorithms certainly exist. For more 
information regarding Karcher means, see pi [30|. 

For n = 2, Since SO(2) is Abelian (the product of rotations in the plane is commutative), 
the Karcher mean is easy to compute. Indeed, for commuting matrices A,B, we have \og(AB) = 
log (A) + \og(B) and exp(A + B) = exp(^4) exp(B). Then, the algorithm in Theorem I A. 1 1 converges 
in one iteration. Forcing Qq = I yields an analytical formula for the Karcher mean of rotations in 
the plane. For SO(3), the maximum distance between two rotations is y/2n, hence the uniqueness 
condition in Theorem IA.1I is not always satisfied, but is rather permissive in the context of our 
application. 

A simpler approach to registration uses the Frobcnius norm in the embedding space K nx " of 
SO(n) instead of Riemannian distances: 

TV N 

embdist 2 ([R],[R'l) = min V \\RiQ - Bl\\l = min V \\Q - RjRL\\l . (A.7) 

QGSO(n) f-f l " F QGSO(n)^-' 11 l|lF 

The second equality holds since the Frobcnius norm is invariant under orthogonal transformations. 
The optimal Q — which we now refer to simply as Q — is what is sometimes called the projected 
Euclidean mean of the N rotation matrices RjR[, i = 1...N. Indeed, ignoring the constraint 
Q G SO(n), the solution would be given by P G W ixn : 



1 N 

p =nY. r J r >- ( a - 



i=i 



A number of authors [271 133| provide closed- form expressions for Q and study its existence and 
uniqueness. Reformulating [33l . Prop. 3.3] in terms of the SVD decomposition of P, we may express 
Q in all cases as: 

P = USV T (SVD decomposition), J = r^™ -1 ^yyijj , Q = UJV T , (A.9) 
which essentially comes down to normalizing the singular values of P so as to make it orthogonal, 



and ensuring the result has determinant +1. The rotation matrix Q is uniquely defined by (|A.9[) 
if cither of these conditions is met: 

(i) &et(UV T ) = +1 and the multiplicity of as a singular value of P is at most 1. 

(ii) det(UV T ) = — 1 and the multiplicity of the smallest singular value of P is 1. 
Substituting (|A.9j) in (|A.7j) . we get: 



N / N \ 

cmbdist 2 ([R], [R']) = II RiQ ~ R'i II f = 2Nn - 2tracc ( Q T Y1 R ^ R 'i ) 

i=l \ i=l / 



(A.10) 



= 2Nn - 2A^trace( JS) (A.ll) 
= 2N(n — (Ti - o-2 cr„_i - sa n ), (A. 12) 
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where o\ > a 2 > • • • > <r n are the singular values of P and s — dct(UV T ). 

Registration with the embedded distance has the advantage of being straightforward to compute 
and uniquely defined in all but pathological cases. Likewise, there is a natural embedded distance 
for V A - 

cmbdist 2 (R,R') II Ri- R i IIf- ( A - 13 ) 



Density normalization 

This appendix presents the derivation of the normalization coefficients c„(re) (|4.6[) - (|4.8[) that appear 
in the probability density function (|4.4p . Recall that the coefficient c n (n) is given by (|4.5[) : 



Ch(k) = / cxp (Ktrace(Z)) d/i(Z), (B-l) 

JSO(n) 

where d/i is the normalized Haar measure over SO(n) such that / SO ( n ) d/z(.Z) = 1. In particular, 
c„(0) = 1. The integrand, g(Z) = exp (retrace (if)), is a class function, meaning that for all 
Q, Z £ SO(n) we have g(Z) = g(QZQ~ 1 ). We are thus in a position to use the Weyl integration 
formula specialized to SO(n) [9|, Exercise 18.1-2]. Formula (|4.3[) applies, 

/ g{Z)d l x{Z) = — ^ ^ T ~g(e 1 ,e 2 )-\e^-e^\ 2 -\e ie --e- ie ^d6 1 de 2 , (B.2) 



'SO(4) 

where we defined 



, /cos 0i -sin 0i \ /cos 2 -sin0 2 \ . 
g(0i,0 2 ) = g diag , . . (B.3) 



smtfi cost/i I \ smv 2 cost/2 

This reduces the problem to a classical integral over the square — or really the torus — [— 7r,7r] x 
[— 7r,7r]. Evaluating g is straightforward: 

3(01,02) = exp (2k- [cos0i +cos0 2 ]). (B.4) 

Using trigonometric identities, we also get: 

| e *i_ e *fe|2.| e *i_ e -tf a |2 
= 4(1 - COS(0! - 6 2 )) (1 - COS(0! + 2 )) 

= 4(1 - cos(0i - 2 ) - cos(0i + 2 ) + cos(0i - 2 ) cos(0i + 2 )) 

= 4 (l - 2 cos 0i cos 2 + g ( cos 20 i + cos 20 2 )j. (B.5) 



Each cosine factor now only depends on one of the angles. Plugging (|B.4|) and (|B.5[) in (|B.2|) and 
using Fubini's theorem, we get: 



C4(«) = ^~ f ' 



r "" -h(fli) d0i, (B.6) 



with: 



Kh) = 7T I e 2KCOsf>2 (l + icos20i-2cos0 lC os0 2 + icos20 2 ) d0 2 . (B.7) 
27r •/-•*■ V 2 2 / 

Now recalling the definition of the modified Bessel functions of the first kind [39[ , 

Iu(x) = ^-I e xcosB cos(z/0)d0, (B.8) 

2?r J-7T 
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we further simplify h to get: 



1 + ]^cos26 l j •Jo(2«)-2cos0i-/i(2k) + --/ 2 (2k). (B.9) 



Plugging (|B.9I) in (|B.6|) and resorting to Bcsscl functions again, we finally obtain the practical 
formula (|4.8[) for C4(«): 



c 4 (re) 



7 (2k) + i/ 2 (2/t) 
*-~\2 _ or. ro^2 



7 (2k) - 27i(2k) ■ /i(2k) + \l {2n) ■ I 2 (2k) 



I {2k) 2 - 2Ii{2kY + I {2k)I 2 {2k). (B.10) 



For generic n, the necessary manipulations are very similar to the developments in this ap- 
pendix. For n = 2 or 3, the computations are easier. For n = 5, the computations take up about 
the same space. For n > 6, the same technique will still work but gets quite cumbersome. 

In (IH Appendix A. 6], Chikuse describes how the normalization coefficients for Langevin dis- 
tributions on 0(n) can be expressed in terms of hypcrgcometric functions with matrix arguments. 
One advantage of this method is that it generalizes to non-isotropic Langevin's. The method we 
demonstrated here, on the other hand, is tailored for our need (isotropic Langevin's on SO(n)) 
and yields simple expressions in terms of Bessel functions — which are readily available in Matlab 
for example. Furthermore, this method applies well to compute the other integrals we need in this 
paper so that this appendix can be seen as an example. 

C Curvature terms 

Cramer-Rao bounds for estimation problems on manifolds include extra terms in comparison to the 
classical CRB's on Euclidean spaces 0,11(1 41 1. These terms stem from the possible curvature of 



the parameter space. As they can be quite tedious to compute, it is tempting to neglect curvature 
terms. One way to do this is to invoke high enough SNR's such that typical errors would v erify 
dist 2 (R, R) -C ^miLc, where if max is the largest sectional curvature of the parameter space j36j . 
Indeed, in this regime, error terms are small enough that curvature has essentially no effect. 
Unfortunately, for synchronization, K~^ x does not grow with N. Assuming we are interested in 
SNR's such that estimators commit a typical error on each Ri of the order of, say, 1 degree, a 
typical dist 2 (R, R) grows linearly with TV (the number of rotations) . The bound remaining 
constant w.r.t. N, we would quickly reach values of ./V such that curvature cannot be legitimately 
neglected via this argument. 

Consequently, we compute the curvature terms from theorems 4 and 5 in Q for n = 2 and 
n = 3 explicitly. Our conclusion will be that it is legitimate to neglect them, at reasonable SNR's. 
This is mainly a consequence of the product nature of V. Indeed, sectional curvatures on product 
manifolds vanish for most tangent 2-planes (more precisely, for all planes extending over more 
than one of the underlying terms of the product). Describing the curvature of such spaces only 
through a global bound K max hides this important structure. By investigating the curvature terms 
in detail, we capture the geometry accurately. For the same reason, we expect that curvature terms 
are negligible for n > 4 too, but we do not conduct the calculations. 

We first treat Va (j3.14[) . then V% (|3.18[) . We show that for rotations in the plane (n = 2), the 
parameter spaces are flat, so that curvature terms vanish exactly. For rotations in space (n = 3), 
we compute the curvature terms explicitly, and show that they are on the order of 0(SNR~ 2 ), 
whereas dominant terms in the CRB are on the order of (^(SNR" 1 ). 

C.l Curvature terms for Va 

The manifold Va is a (product) Lie group. Hence, the Riemannian curvature tensor 1Z of Va on 
the tangent space TrVa is given by a simple formula [28], Corollary 11.10, p. 305]: 

(rc(x,n)n,x) = i||[x,o]f, (c.i) 
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where [X, SI] is the Lie bracket of X = (X\, . . . , Xn) and SI = (Sli, . . . , SIn), two vectors (not 
necessarily orthonormal) in the tangent space T-rVa ■ Following 0, Theorem 4] , in order to compute 
the curvature terms for the CRB of synchronization on Va, we first need to compute 

K m [Sl\Sl'] ^E{(TZ(X,PaSI')P a SI\X)}, (C.2) 

where SI' is any tangent vector in TrP and Pa^' is its orthogonal projection on TrPa- We 
expand X and SI = PaSI' using the orthonormal basis (£kt)k=i...N,t=i...d (EU 01 TrP D TrPa- 

SI = ^ auju and X = y^fffcigfci, (C.3) 

k,e k,e 

such that Sl k = R k Y^ t a kt Ei and X k = R k J2 e /3ktEi. Of course, a ke = /3 ke = Vfc g A. Then, 
since 

[x, si] = ([x 1 ,n 1 ],...,[x N ,n N ]), 

it follows that: 

R„JM! e{1 X.12 'J- .V,.<>, ! j (C.4) 

= lE E i ||l>ttjM^.] 2 }■ ( C - 5 ) 



For X the tangent vector in TrPa corresponding to the (random) estimation error Log R (R), the 
coefficients ftke are random variables. The covariance matrix Ca (|6.2p is given in terms of these 
coefficients by: 

(C A )kk'jr = E {(X, £ M ) (X, = E {Pufcf} • (C6) 

The goal now is to express the entries of the matrix associated to R m as linear combinations of 
the entries of Ca- 

For n = 2, of course, R m = since Lie brackets vanish thanks to the commutativity of rotations 
in the plane. For n = 3, the special structure of SO(3), namely its constant curvature, leads to 
nice expressions, which we obtain now. 

For SO(3), let us consider the orthonormal basis (E\, E 2 , E 3 ) of so(3) (|5.5[) . Observe that 
it obeys [Ei,E 2 ] = E 3 /V2,[E 2 , E 3 ] = E x /y/2, [E 3 , E{\ = E 2 /V2. As a result, equation (|C~5|) 
simplifies and becomes: 

R m [fi,ri] = i^E{K 2 /3 fc3 - a k3 Pk2? + KsAi - afciAa) 2 + Ki/? fc2 - a k2 p kl f) . (C.7) 

We set out to compute the dN x dN matrix R m associated to the bi-linear operator R m w.r.t. the 
basis (|5.6j) . By definition, {R m )kk',w = Rm^M, tik'l']- Equation IC.7I readily yields the diagonal 
entries (k = k' ,£ = £'). Using polarization to determine off-diagonal entries [7J, eq. (48)], 

(Rm)kk'.U' = ^(Rm[6f + &k'l',&kl + £k't'] — R-m [£kt ~ $k'V,$M ~ £k't']), (C8) 

it follows through simple calculations (taking into account the orthogonal projection onto TrPa 
that appears in (|C.2[) ) that: 

{Rm)kk',w = I -\{C A ) k k,u' i£k = V$A,£?£', (C.9) 
I otherwise. 
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Hence, R m (CA) is a block-diagonal matrix whose nonzero entries are linear functions of the entries 
of Ca- Theorem 4 in Q requires (|C.9|) to compute the matrix R m {F\). Considering the special 
structure of the diagonal blocks of F\ (|6.5[) (they are proportional to I3), we find that 

R m {F\) = iddiag(F)) = jddiag(£^) g> I 3 , (CIO) 

where ddiag puts all off-diagonal entries of a matrix to zero. Thus, as the SNR goes up and hence 
as C\ goes down, the curvature term R m (F^)F^ + F^R m (F\) in [3, Thm. 4] will become negligible 
compared to the main term in the CRB, F\. 



C.2 Curvature terms for V§ 

The manifold V% is a quotient manifold of V . Hence, the Riemannian curvature tensor 1Z of V% is 
given by O'Neill's formula (2|| Thm 7.47, p. 213 and Lemma 3.39, p. 77], showing that the quotient 
operation can only increase the curvature of the parameter space: 

(ft(D7rX,D7rft)D7rfJ,D7rX) = i|| [X, SI] || 2 + ^|| [X, f2] v || 2 , (C.ll) 

where X, f2 are horizontal vectors in Hr C Tr"P associated to tangent vectors to V% via the 
differential of the Riemannian submersion D7r(R) (|3.20[) . denoted simply as D7r for convenience. 
The vector [X, fi] v G Vr C TrT 3 is the vertical part of [X, f2], i.e., the component that is parallel 
to the fibers. Since in our case, moving along a fiber consists in changing all rotations along the 
same direction, [X, 17] v corresponds to the mean component of [X, Si\: 

1 N 

[X,H] V = {R lU ,...,R N u), withu;=-^[iiJX fe ,i?JO fc ]. (C.12) 

fe=i 

For n = 2, since [X, ft] — 0, [X, H] v = also, hence is still a flat manifold, despite the 
quotient operation. We now show that for n — 3 the curvature terms in the CRB @, Thm. 5] are 
equivalent to the curvature terms for Va with A := plus extra terms that decay as 1/N and can 
thus be neglected. 

The curvature operator R m 0, eq. (54)] is given by: 

RmfcM.Gw] =E{<7i(D7rX,D7r€fc<)D7re W) D7rX)} (C.13) 
= E |i||[X,^ - ^]|| 2 + fllPUw - ^] V H 2 } • (CM) 

The tangent vector £ ke — £Yg is, by construction, the horizontal part of £k£- The vertical part 
decreases in size as N grows: 



*kt\ 



—{R\Ei, . . . , RnE^) 



2 = 1_ 
~ N' 



(C.15) 



It follows that: 



E {II [X, C w - & ] || 2 } = E { || [X, £ fc£ ]|| 2 } + 0{1/N). (C.16) 



Hence, up to a factor that decays as 1/N, the first term in the curvature operator R m is the same 
as that of the previous section for Va, with A := 0. We now deal with the second term defining 
R m : 

[X, ^,] v = {Rm, Rnuj), with (C.17) 
u> = ±[RjX k ,E e ] = ±Y,l3 ke [E.,E e ]. (C.18) 
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It is now clear that for large N this second term is negligible compared to E {|| [X, £fcf]|| 2 }: 

||[X,^] V || 2 = 7V|M| 2 =0(1/AO. (C19) 



Applying polarization to R m to compute off-diagonal terms then concludes the argument showing 
that the curvature terms in the CRB for synchronization of rotations on despite an increased 
curvature tensor owing to the quotient operation (|C.11J) . are very close (within a 0(1/N) term) to 
the curvature terms established earlier for synchronization on Va, with A := 0. We do not include 
an exact derivation of these terms as it is quite lengthy and does not bring much insight to the 
problem. 

Proof that G tJ (QZQ T ) = QG tJ (Z)Q T and that G tJ (Z T ) = -G l3 (Z). 



Recall the definition of Gy : SO(n) — > so(n) 

G ij (Z) = [grad log f^Z^Z. (D.l) 

Now let us introduce a few functions: 

.9 : SO(n) -)I:Zhi g(Z) = log f^Z), (D.2) 
hi : SO(n) SO(n) : Z ^ h x {Z) = QZQ 1 , (D.3) 
h 2 : SO(n) ->■ SO(n) : Z H- h 2 (Z) = Z T . (D.4) 

Notice that because of Assumption [3] (fij is only a function of the eigenvalues of its argument), we 
have g o hi = g for i = 1,2. Hence, 

grad. 9 (Z) = grad( 9 o h t ){Z) = (Dhi(Z))* ferad fl (^(Z))] , (D.5) 

where (Dhi(Z))* denotes the adjoint of the differential Dhi(Z), defined by 

VHi,H 2 e T z SO( ? i), (T>hi(Z)[Hi\,H a ) = {H u (Dhi(Z))*[H 2 }) . (D.6) 

The rightmost equality of (|D.5|) follows from the chain rule. Indeed, starting with the definition 
of gradient, we have: 

VF G T z SO(n), (grad(.g o h t ){Z), H) = B{g o h t )(Z)[H] 

= Dg(h i (Z))[Dh i (Z)[H}} 

= (grad 5 (/ li (Z)),D/i i (Z)[ff]) 

= ((Dfti(Z))* [grad c?(/ii(Z))], if) . (D.7) 

Let us compute the differentials of the h^s and their adjoints: 

m 1 (Z)[H}=QHQ T 7 {Dh 1 (Z))*[H} = Q T HQ, (D.8) 

m 2 {Z)[H] = H T 7 (Dh 2 (Z))*[H] = H T . (D.9) 



Plugging this in (|D.5[) . we find two identities (one for hi and one for h 2 ): 

grad log/. y (Z) = Q T [grad log fij (QZQ T )]Q, (D.10) 
grad log f tj (Z) = [grad log / y (Z T )] T . (D.ll) 

The desired result about the G^'s now follows easily. For any Q £ O(n), 

G tl (QZQ T ) = [grad log f tl (QZQ T )] T QZQ T ^ [Qgrad log f VJ (Z)Q T ] T QZQ T = QGij (Z)Q T . 

(D.12) 

And similarly: 

Gy(Z T ) = [grad log f lJ (Z T )] T Z T = grad log/ lJ (Z)Z T = ZGj(Z)Z T = -ZG y (Z)Z T = -G -(Z), 
where we used that Gij(Z) is skew-symmetric and we used (|D.12[) for the rightmost equality. 



31 



Proof of lemma 15.2 



Proof. We give a constructive proof, distinguishing among three cases. (Case 1: {i,j} n {k,£} = 
). Construct T as the identity /„ with columns i and k swapped, as well as columns j and £. 
Construct S as I n with S u := -1. By construction, it holds that T T ET = E' , T T E'T = E, 
SES = -E and SE'S = E'. Set P = TS to conclude: P T EP = ST T ETS = SE'S = E', 
P T E'P = ST T E'TS = SES = -E. (Case 2: i = k,j ^ £). Construct T as the identity I n with 
columns j and £ swapped. Construct S as /„ with Sjj := — 1. The same properties will hold. Set 
P = TS to conclude. ( Case 3: i = £, j ^ k). Construct T as the identity with columns j and k 
swapped and with Tu := — 1. Construct S as /„ with Sjj := — 1. Set P = TS to conclude. (Cases 
4 and 5: j = k or j = £). The same construction goes through. □ 
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